ABSTRACT

Ideally,  health  monitoring  of  new,  complex 
engineering  systems  should  occur  from  initial 
operation  to  decommissioning.  Health 
monitoring  typically  involves  a  suite  of 
modules,  including  system  monitoring,  fault 
detection,  fault  diagnostics,  and  system 
prognostics.  However,  for  systems  which 
have  not  yet  operated,  this  is  challenging. 
Most  available  health  monitoring  modules  are 
empirically  based,  meaning  they  are  derived 
from  available  historic  data.  For  new  system 
designs,  such  data  simply  does  not  exist.  This 
research  proposes  an  adaptive  modeling 
system  which  initially  builds  empirical  models 
from high-fidelity simulated data. This data
suffers from the common problems of data
simulation, which are caused by complicated physical
mechanisms and by the simplifying
assumptions made in model development. As
actual  system  data  becomes  available,  the 
empirical  models  adapt  in  an  automated  and 
intelligent  way  to  account  for  real-world, 
nominal  data  relationships. 

A  key  challenge  in  automatically  adaptive 
empirical  models  lies  in  differentiating 
between  faulted  operation  and  nominal 
operation  which  is  not  well-described  by  the 
physics-based  data.  Nominal  operation  may 
extend  beyond  the  simulated  data  for  many 
reasons: the system may be operating in unanticipated
environments; the assumptions
made  in  model  development  may  cause 
inaccuracies  in  the  data;  or  the  relationships 
modeled  may  simply  be  incorrect.  Traditional 
fault  detection  methods  such  as  those  using  the 
sequential  probability  ratio  test  are  not  able  to 
distinguish  between  unexpected  nominal 
operation  and  truly  faulted  operation. 
However, the main benefit of using adaptive
models lies in their ability to accurately learn
expanded nominal relationships while
detecting  and  differentiating  faulted 
conditions.  For  the  purposes  of  accurately 
adapting  a  monitoring  system,  a  principal 
component-based  method  is  proposed  to 
distinguish  between  these  two  cases. 

As  faults  are  detected,  fault  diagnostics 
and  system  prognostics  are  employed  to 
provide  a  complete  health  monitoring  system. 

The  proposed  adaptive  monitoring  system  is 
applied  to  simulated  data  of  the  newly 
designed  International  Reactor  Innovative  and 
Secure  (IRIS)  nuclear  plant.  * 

1.  INTRODUCTION 

Development  of  traditional  health  monitoring  systems 
requires  either  large  amounts  of  operational  data 
spanning  all  expected  operating  conditions  or  high 
fidelity  first  principle  models  (FPMs)  which  capture  the 
physics  of  failure  relationships.  However,  in  some 
cases  neither  of  these  is  available.  New  designs  of 
complex  systems  may  be  too  complicated  to  adequately 
model  using  first  principle  approaches,  but  operational 
data  is  not  available  until  the  system  has  been  in  service 
for  some  time.  Monitoring  these  systems  can  be 
challenging.  An  adaptive  monitoring  method  is 
proposed  which  extends  traditional  auto-associative 
kernel  regression  (AAKR)  models.  A  principal 
component  analysis  (PCA)  model  is  used  to 
differentiate  between  faulted  operations  and  expanded 
nominal  operations.  If  a  fault  is  detected,  traditional 
fault  diagnostics  and  prognostics  methods  are  applied 
to  determine  the  fault  type  and  RUL,  respectively. 

This research applies the proposed health
monitoring system to the new Westinghouse-designed
International Reactor Innovative and Secure (IRIS)
nuclear plant, shown in Figure 1. The IRIS design is a
medium-sized Grid Appropriate Reactor (GAR)
designed for implementation in developing electric
grids. These reactors have additional monitoring
needs compared with nuclear plants built in developed nations;
these needs include increased availability, longevity
between refueling and maintenance cycles, and safety
and proliferation resistance. Additionally, these
reactors are designed to operate remotely in countries
with limited infrastructure and skilled personnel.

* This is an open-access article distributed under the terms of
the Creative Commons Attribution 3.0 United States License,
which permits unrestricted use, distribution, and reproduction
in any medium, provided the original author and source are
credited.


Figure 1: IRIS Reactor Design


The  following  section  discusses  the  methodology 
used  to  build  an  adaptive  health  monitoring  system.  An 
application  of  the  system  to  simulations  of  the  IRIS 
reactor  is  given.  Finally,  conclusions  and  areas  of 
ongoing  work  are  outlined. 

2.  METHODOLOGY 

A  full  health  monitoring  system  consists  of  several 
modules,  as  shown  in  Figure  2  (Callan  et  al.,  2006; 
Jardine  et  al.,  2006;  Kothamasu  et  al.,  2006).  Data 
collected  from  a  system  of  interest  is  monitored  for 
deviations  from  normal  behavior.  Monitoring  can  be 
accomplished  through  a  variety  of  methods,  including 
FPMs,  empirical  models,  and  statistical  analysis  (Hines 
et  al.,  2006).  The  monitoring  module  can  be 
considered  an  error  correction  routine;  the  model  gives 
its  best  estimate  of  the  true  value  of  the  system 
variables.  These  estimates  are  compared  to  the  data 
collected from the system to generate a time-series of
residuals.  Residuals  characterize  system  deviations 
from  normal  behavior  and  can  be  used  to  determine  if 
the  system  is  operating  in  an  abnormal  state.  A 
common  test  for  anomalous  behavior  is  the  Sequential 
Probability  Ratio  Test  (SPRT)  (Wald,  1945).  This 
statistical  test  considers  a  sequence  of  residuals  and 
determines  if  they  are  more  likely  from  the  distribution 
that  represents  normal  behavior  or  a  faulted 
distribution, which may have a shifted mean value or
altered standard deviation from the nominal
distribution.  If  a  fault  is  detected,  it  is  often  important 
to  identify  the  type  of  fault;  systems  will  likely  degrade 
in  different  ways  depending  on  the  type  of  fault,  and 
different  prognostic  models  will  be  needed.  Fault 
diagnostic  results  are  used  to  identify  the  appropriate 
prognostic model. Expert systems, such as fuzzy rule-based
systems, are common fault diagnosers. Finally, a
prognostic  model  is  employed  to  estimate  the 
Remaining  Useful  Life  (RUL)  of  the  system.  This 
model  may  include  information  from  the  original  data, 
the  monitoring  system  residuals,  and  the  results  of  the 
fault  detection  and  isolation  routines. 
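
As an illustration of the sequential test described above, the following is a minimal Python sketch of Wald's SPRT for a positive mean shift in Gaussian residuals. The function name, the restart after accepting the nominal hypothesis, and the default false-alarm and missed-alarm rates are assumptions made for this example and are not taken from the monitoring system described here. The altered-variance case mentioned in the text is omitted.

```python
import math


def sprt_mean_shift(residuals, sigma, shift, alpha=0.01, beta=0.1):
    """Return the index at which a mean shift is declared, or None.

    H0: residuals ~ N(0, sigma^2)      (nominal)
    H1: residuals ~ N(shift, sigma^2)  (faulted, shifted mean)
    alpha and beta are the assumed false-alarm and missed-alarm rates.
    """
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    llr = 0.0
    for k, r in enumerate(residuals):
        # Log-likelihood ratio increment for two Gaussians with equal variance.
        llr += (shift / sigma ** 2) * (r - shift / 2.0)
        if llr >= upper:
            return k
        if llr <= lower:
            llr = 0.0  # nominal decision reached; restart the test
    return None


# Ten nominal residuals followed by residuals sitting at the shifted mean.
detected_at = sprt_mean_shift([0.0] * 10 + [1.0] * 20, sigma=1.0, shift=1.0)
```

Because the decision is only reached after several consistent residuals accumulate, a slowly growing fault can go undeclared for some time, which is the behavior that motivates the alternative test discussed in Section 2.2.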


[Figure 2 stages: Data; Monitor and Detect (Is there an anomaly or fault?); Diagnose (What is the fault?); Prognose (RUL); Mitigate (How can the effects of the failure be mitigated?)]

Figure 2: Full Health Monitoring System

For  monitoring  new  equipment  designs,  the 
traditional  monitoring  models  and  fault  detection 
routines  are  not  sufficient.  An  adaptive  AAKR  model 
is  proposed  which  is  initially  populated  with  data 
simulated  from  a  high  fidelity  physics  model  and 
adapts  to  actual  nominal  operating  data  as  it  is  collected 
while  still  correctly  detecting  faulted  conditions 
(Humberstone  et  al.,  2009a).  The  following  sections 
describe  the  adaptive  non-parametric  model  (ANPM), 
the  expanded  operating  condition  monitoring  method, 
fault  detection  and  diagnostics,  and,  finally,  the 
prognostic  method  used. 

2.1  Adaptive  Monitoring  Models 

The  ANPM  is  developed  as  an  automated  method  for 
hybrid  model  adaptation;  it  acts  as  a  bridge  between 
data  generated  from  FPMs  and  actual  operational  data. 
The  proposed  ANPM  builds  on  the  AAKR  model 
(Hines  et  al.,  2007a).  This  model  is  attractive  for  many 
reasons.  AAKR  is  primarily  an  error-correction 
method;  when  presented  with  a  new  observation,  it 
attempts  to  determine  the  “correct”  sensor  readings 
based on previous experience. Additionally, it is a non-parametric,
memory-based model, which means that
the  model  consists  primarily  of  a  matrix  of  exemplar 
memory  vectors,  X.  The  vectors  contained  in  X  can  be 
chosen  through  a  number  of  different  algorithms: 
vector  ordering,  min-max  selection,  clustering  methods, 
etc. (Garvey and Hines, 2006). The goal of any of these
methods  is  to  select  a  set  of  memory  vectors  which 
adequately  covers  the  operating  region,  both  in  range 
and  in  intermediary  relationships.  When  a  new 
observation  is  presented  to  the  model,  its  “correct” 
value  is  determined  as  a  weighted  sum  of  the  most 
similar  exemplar  vectors.  These  weights  are  based  on 
the  Euclidean  distance  metric  and  the  Gaussian  kernel; 
the weight of the ith exemplar vector is given by (1):

$w_i = \exp\left(-\frac{\sum_j (x_j - X_{i,j})^2}{h^2}\right)$  (1)

where $x_j$ is the jth sensor value for the new observation,
$X_{i,j}$ is the jth sensor value for the ith exemplar vector, and
$h$ is the kernel bandwidth. The prediction of the
“correct” vector of sensor values is given by (2):

$\hat{x} = \frac{\sum_i w_i X_i}{\sum_i w_i}$  (2)

where $X_i$ is the ith exemplar vector. Because the model
is  based  entirely  on  the  memory  matrix,  new 
observations  can  be  appended  to  it  in  order  to  be 
included  in  future  calculations.  This  makes  adaptation 
quick  and  straightforward. 
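
A minimal Python sketch of the AAKR prediction step is given below. It assumes the Gaussian kernel takes the form exp(-d^2/h^2), where d is the Euclidean distance between the query and an exemplar; the function names and the toy memory matrix are hypothetical and serve only to illustrate the weighted-sum prediction.

```python
import math


def aakr_weights(query, memory, h):
    """Gaussian kernel weight of each exemplar for one query vector."""
    weights = []
    for exemplar in memory:
        d2 = sum((q - e) ** 2 for q, e in zip(query, exemplar))
        weights.append(math.exp(-d2 / h ** 2))
    return weights


def aakr_predict(query, memory, h):
    """Weighted average of the exemplars: the 'corrected' sensor vector."""
    w = aakr_weights(query, memory, h)
    total = sum(w)
    if total == 0.0:
        raise ValueError("query is too far from every exemplar for this h")
    n = len(query)
    return [sum(wi * ex[j] for wi, ex in zip(w, memory)) / total
            for j in range(n)]


# Toy memory matrix: two perfectly correlated sensors.
memory = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
query = [1.0, 1.2]                      # second sensor reads slightly high
prediction = aakr_predict(query, memory, h=0.5)
residual = [q - p for q, p in zip(query, prediction)]

# Adaptation is a plain append to the memory matrix.
memory.append([1.5, 1.5])
```

In this toy case the prediction stays close to the exemplar [1.0, 1.0], so the residual is concentrated in the second sensor, which is the error-correction behavior described above.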

A  three-phase  model  development  method  is 
proposed  for  the  ANPM  (Figure  3).  During  the  first 
phase,  observed  signal  values  are  candidates  for 
replacing  the  simulated  FPM  data.  Because  this 
adaptation  is  automatic,  it  is  crucial  to  determine  if 
observations  that  are  not  well  described  by  the 
simulated  data  are  due  to  faulted  operation  or  expanded 
operating  conditions.  Expanded  conditions  may  occur 
for  many  reasons:  the  system  may  be  operating  outside 
the  expected  region,  the  assumptions  made  in  model 
development  may  be  inaccurate,  or  the  sensor  noise 
may  contaminate  the  nominal  data  to  the  point  of 
appearing faulted. The proposed PCA method to
differentiate  between  true  faults  and  expanded 
conditions  is  discussed  in  the  following  section.  After 
model  adaptation  is  complete,  the  data  vectors  from 
FPM  simulation  should  be  completely  replaced  by  the 
observed  data,  resulting  in  an  AAKR  model  built 
entirely  on  nominal  operation  data.  Then,  a  full  fuel 
cycle  (in  the  case  of  a  nuclear  power  plant)  is  suggested 
for  model  validation.  During  this  time,  the 
performance  of  the  model  should  be  closely  evaluated 
to  determine  if  it  has  been  adequately  adapted  from  the 
first  principle  data  to  the  actual  operating  data.  If 
model  performance  is  determined  to  be  poor,  the  model 
should  re-enter  the  adaptation  phase  to  expand  the 
memory  matrix  coverage  of  the  operational  region. 
Finally,  the  third  phase  covers  the  remaining  reactor 
operation to the next maintenance activity. This
process may be repeated after refueling or maintenance
activities to adapt the model to slight deviations in
sensor relationships due to recalibration or
maintenance.

Figure 3: Model Adaptation

During the first phase of model development, new
observations of nominal operation are appended to the
memory matrix as they become available. As new exemplar
vectors  are  added,  it  becomes  necessary  to  remove  the 
simulated  exemplars  so  that  future  predictions  are 
based  solely  on  the  observed  behaviors.  This  is 
accomplished  by  simply  deleting  the  simulated 
exemplars  which  have  been  weighted  most  heavily  in 
the  past,  indicating  that  they  are  most  similar  to  the 
added  observations.  A  new  vector  is  used  to  track  the 
sum  of  the  weights  for  each  of  the  simulated  exemplars. 
At  designated  intervals,  the  simulated  exemplar  with 
the  highest  sum  of  weights  is  deleted;  this  interval  is  set 
such  that  all  of  the  simulated  exemplars  are  deleted  by 
the  end  of  the  adaptation  phase. 
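
The replacement scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions: the class name, the use of unnormalized Gaussian kernel weights of the form exp(-d^2/h^2), and the fixed deletion interval counted in accepted observations are all hypothetical choices for the example.

```python
import math


def gaussian_weight(query, exemplar, h):
    """Assumed kernel: exp(-d^2 / h^2) with d the Euclidean distance."""
    d2 = sum((q - e) ** 2 for q, e in zip(query, exemplar))
    return math.exp(-d2 / h ** 2)


class AdaptiveMemory:
    """Memory matrix that swaps simulated exemplars for observed ones."""

    def __init__(self, simulated, h, interval):
        self.simulated = [list(v) for v in simulated]
        self.observed = []
        # Running sum of weights received by each simulated exemplar.
        self.weight_sums = [0.0] * len(self.simulated)
        self.h = h
        self.interval = interval
        self._count = 0

    def add_nominal(self, obs):
        """Append an observation already judged nominal (not faulted)."""
        for i, ex in enumerate(self.simulated):
            self.weight_sums[i] += gaussian_weight(obs, ex, self.h)
        self.observed.append(list(obs))
        self._count += 1
        if self.simulated and self._count % self.interval == 0:
            # Delete the simulated exemplar weighted most heavily so far.
            idx = max(range(len(self.simulated)),
                      key=lambda i: self.weight_sums[i])
            del self.simulated[idx]
            del self.weight_sums[idx]

    @property
    def memory(self):
        return self.simulated + self.observed
```

With three simulated exemplars and an interval of two, six accepted observations are enough to leave a memory matrix built entirely from observed data, which mirrors the end state of the adaptation phase.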

2.2  Fault  Detection  and  Expanded  Condition 
Monitoring 

Fault  detection  within  the  adaptive  framework  is 
particularly  challenging.  The  model  must  be  elastic 
enough  to  allow  for  some  deviation  of  nominal 
operating  data  from  the  simulated  first-principle  data; 
this  is  expected  due  to  inaccuracies  in  the  simulation. 
However,  the  model  must  also  be  able  to  determine 
which  observations  are  the  result  of  faulted  operation  to 
ensure  these  observations  are  not  added  to  the  memory 
matrix.  The  SPRT  is  statistically  shown  to  be  one  of 
the  fastest  methods  for  identifying  deviations  in 
residuals.  However,  in  an  adaptive  framework,  the 
sequential  nature  of  the  SPRT  is  a  hindrance.  Because 
small  faults  may  not  be  identified  immediately,  the 
SPRT  would  allow  the  ANPM  to  adapt  to  a  growing 
fault  without  ever  detecting  it.  This  has  long  been  one 
of  the  major  drawbacks  of  automated  adaptation. 
However, a principal component analysis (PCA) based
method  is  proposed  which  can  determine  if  a  new 
observation  vector  is  nominal,  faulty,  or  the  result  of  an 
expanded  operating  condition.  A  full  discussion  of  the 
PCA Expanded Condition Monitoring (ECM) system is
available  in  (Humberstone  et  al.,  2009b). 

PCA  transforms  data  into  an  orthogonal  vector  set 
which  facilitates  dimensionality  reduction.  By 
choosing  the  most  useful  Principal  Components  (PCs), 
generally  considered  those  with  the  most  variance,  the 
data  set  can  be  reduced  to  a  smaller  number  of  inputs. 
When  a  new  observation  is  gathered,  it  is  transformed 
to  the  PC  space  to  determine  if  it  is  consistent  with  the 
data seen in the past. Two metrics are used to
determine how well a new observation fits into the PC
model: Hotelling’s T² statistic and the Q-statistic.
Figure 4 shows a two-dimensional PC model of three-dimensional
data to illustrate the two statistics.
Hotelling’s T² statistic describes variation within the
model, as shown in the figure on the left. Conversely,
the Q-statistic measures the deviation of the new
observation outside the model, that is, the part of the
observation that the retained PCs cannot reconstruct.


Figure  4:  PCA  Statistics 

The T²- and Q-statistics can be used to determine if
a new observation is in one of three classes: expected
nominal operation, expanded nominal operation, or
faulted operation. A PC model is developed using the
simulated first-principle data, and limits on acceptable
T²- and Q-statistic values are determined from this data.
If both statistics for a new observation are within these
limits, the observation is the result of expected nominal
operation. If the T²-statistic is outside the expected
limit, but the Q-statistic is within its limit, the new
observation is due to expanded nominal operations.
This large T²-statistic indicates that the new
observation deviates from the center of the model more
than what has been seen in the past, but the acceptable
Q-statistic indicates that it is still described by the
underlying relationships in the model. Finally, if the Q-statistic
is outside of its designated limit, regardless of
the value of the T²-statistic, the observation is
considered faulted and is not included in the model
adaptation. The large Q-statistic indicates that the new
observation deviates significantly from the model
relationships seen in the past. It is important to note
that traditional PCA is a linear data analysis technique.
The method described here is applicable to data which
exhibits linear or nearly linear relationships, at least
within some region. The methodology may be
extended to non-linear systems through Kernel PCA;
however, that is beyond the scope of the current work.
The interested reader is referred to (Humberstone,
2010) for more information on the use of Kernel PCA
for non-linear systems.
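
The three-way decision rule can be sketched in Python as follows. The sketch assumes that a PC model has already been fitted to the simulated data, so the mean vector, orthonormal loadings, eigenvalues, and the two limits are supplied as inputs; the function names and all numerical values are hypothetical. The toy model mirrors the two-dimensional PC model of three-dimensional data used for illustration in Figure 4.

```python
def t2_and_q(obs, mean, loadings, eigenvalues):
    """Hotelling's T^2 and the Q-statistic for one observation.

    loadings: retained PCs as orthonormal vectors.
    eigenvalues: variance of the training data along each retained PC.
    """
    centered = [x - m for x, m in zip(obs, mean)]
    scores = [sum(c * p for c, p in zip(centered, pc)) for pc in loadings]
    # Variation inside the model, scaled by the variance of each PC.
    t2 = sum(s * s / lam for s, lam in zip(scores, eigenvalues))
    # Squared residual after reconstructing from the retained PCs.
    recon = [sum(s * pc[j] for s, pc in zip(scores, loadings))
             for j in range(len(obs))]
    q = sum((c - r) ** 2 for c, r in zip(centered, recon))
    return t2, q


def classify(obs, mean, loadings, eigenvalues, t2_limit, q_limit):
    t2, q = t2_and_q(obs, mean, loadings, eigenvalues)
    if q > q_limit:
        return "faulted"            # breaks the modeled relationships
    if t2 > t2_limit:
        return "expanded nominal"   # same relationships, new region
    return "expected nominal"


# Hypothetical PC model: the data varies in the x-y plane only.
model = dict(mean=[0.0, 0.0, 0.0],
             loadings=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
             eigenvalues=[4.0, 1.0],
             t2_limit=9.0, q_limit=0.25)

labels = [classify(obs, **model)
          for obs in ([1.0, 0.5, 0.1], [8.0, 0.0, 0.1], [1.0, 0.5, 2.0])]
```

Checking the Q-statistic first reflects the rule stated above: an observation that violates the Q limit is treated as faulted regardless of its T² value, and only observations that pass are candidates for the memory matrix.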

2.3  Fault  Diagnostics 

After  a  fault  has  been  detected  using  the  proposed  PCA 
method,  an  expert  system  or  classification  algorithm 
may  be  used  to  determine  the  fault  type.  The 
diagnostic  system  utilized  in  this  research  uses 
monitoring  system  residuals  to  determine  fault  type. 
The  faulted  system  residuals  are  compared  to  those 
contained  in  a  database  of  historical  residual  signatures 
to  determine  under  which  fault  the  system  is  operating. 
Similar  systems  may  be  built  using  additional  features, 
such  as  the  results  of  the  PCA  fault  detection  routine, 
including PC values or T²- and Q-statistics. For the
current  application,  this  added  complexity  was 
unnecessary. 
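
One simple way to compare faulted residuals against stored signatures is a nearest-neighbor match, sketched below in Python. The matching rule (smallest Euclidean distance), the function name, and the signature values are assumptions made for illustration; the comparison actually used by the diagnostic system is not specified here, and the "sensor drift" entry is a hypothetical second fault type.

```python
import math


def diagnose(residual, signatures):
    """Return the fault whose stored residual signature is closest."""
    best_name, best_dist = None, math.inf
    for name, signature in signatures.items():
        dist = math.dist(residual, signature)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical database of mean residual signatures per fault type.
signatures = {
    "heat exchanger fouling": [0.0, -1.5, 0.2],
    "sensor drift": [1.0, 0.0, 0.0],
}

fault, distance = diagnose([0.1, -1.2, 0.3], signatures)
```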

2.4  System  Prognostics 

The  final  step  in  the  proposed  health  monitoring  system 
is  prognostics.  The  prognostic  module  contains  a  bank 
of  models,  one  for  each  fault  type.  The  results  of  the 
fault  identification  routine  determine  which  prognostic 
model is used to make an RUL estimate for the system.
These  estimates  are  continuously  updated  as  the  system 
runs. 

For  nuclear  power  plants  and  other  expensive, 
safety-critical  systems,  an  individual-based,  or  Type  III 
(Hines  et  al.,  2007b),  prognostic  estimate  is  ideal.  This 
research  utilizes  a  bank  of  General  Path  Model  (GPM) 
prognostic  models.  A  full  prognostic  module  would 
likely  include  many  types  of  prognostic  algorithms 
depending  on  the  fault  type  and  its  progression  to 
failure.  Many  prognostic  algorithms  have  been 
proposed  and  studied  (Kothamasu  et  al.,  2006).  The 
results  presented  here  focus  on  the  GPM  methodology. 

GPM was first proposed by Lu and Meeker (1993)
to  move  traditional  reliability  analysis  from  failure-time 
analysis  to  failure-process  analysis.  The  model  was 
developed  to  capitalize  on  censored  test  units.  It 
attempts  to  track  degradation  as  a  function  of  time  or 
duty  cycles  and  extrapolate  that  degradation  path  to 
some  predefined  critical  failure  threshold,  giving  an 
estimate  of  when  the  unit  would  have  failed  had  testing 
continued.  The  GPM  reliability  methodology  has  a 
natural extension to estimation of RUL. If degradation
of a system or component can be either directly
measured or inferred, the degradation progression of a
specific component can be used to estimate its RUL.
This  measure  of  degradation  is  termed  a  prognostic 
parameter. Methods for automatically identifying an
optimal prognostic parameter from data have been
developed and were utilized here. The interested reader
is referred to (Coble, 2010) for more information on
prognostic parameter identification.

Annual Conference of the Prognostics and Health Management Society, 2010

GPM  analysis  begins  with  some  assumption  of  an 
underlying  functional  form  of  the  degradation  path  for 
a specific fault mode. The degradation of the ith unit at
time $t_j$ is given by (3):

$y_{ij} = \eta(t_j, \phi, \theta_i) + \varepsilon_{ij}$  (3)

where $\phi$ is a vector of fixed (population) effects, $\theta_i$ is a
vector of random (individual) effects for the ith
component, and $\varepsilon_{ij} \sim N(0, \sigma^2)$ is the standard
measurement error term. Application of the GPM
methodology involves several assumptions. First, the
degradation data must be describable by a function, $\eta$;
this  function  may  be  derived  from  physics-of-failure 
models  or  from  past  degradation  data.  In  order  to  fit 
this  model,  historical  degradation  data  from  a 
population  of  identical  components  or  systems  must  be 
available,  or  appropriate  data  may  be  simulated.  This 
data  should  be  collected  under  similar  use  (or 
accelerated  test)  conditions  and  should  reasonably  span 
the  range  of  individual  variations  between  components. 
Because  GPM  uses  degradation  measures  instead  of 
failure  times,  it  is  also  not  necessary  that  all  historical 
units  are  run  to  failure;  censored  data  contains 
information  useful  to  GPM  forecasting.  The  final 
assumption  of  the  GPM  model  is  that  there  exists  some 
defined  critical  level  of  degradation,  D,  beyond  which  a 
component  no  longer  meets  its  design  specifications, 
i.e.  the  component  has  failed.  Therefore,  some 
components  should  be  run  to  failure  in  order  to  quantify 
this  degradation  level.  Alternatively,  engineering 
judgment  may  be  used  if  the  nature  of  the  degradation 
parameter  is  explicitly  known. 

As data is collected on a faulted unit, the GPM may
be used to estimate the RUL, as shown in Figure 5.
Here, the known parametric function is fit to the
available degradation data to give a unit-specific
prognostic model. The fitted parametric model is then
extrapolated to the degradation threshold to give an
estimated failure time and corresponding RUL. For
systems with very little data or significant noise
contamination, Bayesian methods may be used to
include prior information in the model fit. Including
this information helps “force” the fitted parametric
model to take the shape seen in previous cases. For a
complete discussion of Bayesian updating methods in
the GPM, the interested reader is referred to (Coble and
Hines, 2009).

Figure 5: Parameter Trending and RUL Estimation
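
The fit-and-extrapolate step can be illustrated with a short Python sketch. It assumes a linear degradation path fitted by ordinary least squares to a single unit; the function names, the linear form, and the numerical values are hypothetical. Population effects and the Bayesian prior discussed above are left out for brevity.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * t + intercept."""
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, values))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def estimate_rul(times, values, threshold):
    """Extrapolate the fitted path to the failure threshold.

    Returns the time remaining after the last observation, or None when
    the fitted path is not increasing toward the threshold.
    """
    slope, intercept = fit_line(times, values)
    if slope <= 0.0:
        return None
    failure_time = (threshold - intercept) / slope
    return max(failure_time - times[-1], 0.0)


# Hypothetical prognostic parameter sampled monthly, threshold D = 6.0.
months = [1.0, 2.0, 3.0, 4.0]
degradation = [0.5, 1.0, 1.5, 2.0]
rul_months = estimate_rul(months, degradation, threshold=6.0)
```

For this toy path the fitted slope is 0.5 per month, the threshold is crossed at month 12, and the estimate made at month 4 is therefore 8 months of remaining life.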

The  following  section  presents  the  application  of  the 
adaptive  health  monitoring  system  to  the  IRIS  plant 
design. 

3.  APPLICATION  AND  RESULTS 

The  proposed  adaptive  monitoring  and  health 
management  architecture  was  applied  to  the  IRIS 
nuclear  plant  design.  The  IRIS  design  is  a 
Westinghouse generation IV small- to medium-size
modular reactor design that has a proposed 335 MW
output.  The  IRIS  reactor  is  an  integral  Pressurized 
Water  Reactor  with  eight  helical  coil  steam  generators. 
There  are  a  number  of  advantages  that  the  IRIS  reactor 
has  over  traditional  pressurized  water  reactors;  most  of 
these  are  safety  related  benefits  necessary  for  remote 
operation  in  developing  nations.  Because  the  IRIS 
reactor  is  still  in  the  design  stage,  no  actual  operation 
data  is  available.  However,  two  simulators  have  been 
developed.  Data  obtained  from  those  simulations  is 
used in this research. Because the two simulators do
not track the same set of sensors, model development
and evaluation were performed in two phases. First, the
nominal  condition  adaptation  of  the  ANPM  was  tested 
using  a  reduced  number  of  sensors  to  utilize  both 
simulators.  Then,  the  PCA-based  expanded  operation 
monitoring  system  was  tested  using  the  full  data  set 
from  the  high-fidelity  simulation.  The  two  simulators 
and  ANPM  results  are  described  next. 

3.1  ANPM  Results 

The  first  simulator,  considered  to  be  a  low-fidelity 
simulator,  was  built  by  researchers  at  the  University  of 
Tennessee  using  MATLAB  Simulink®  (Li  et  al., 
2009). The Simulink model is a modular model which
includes  the  reactor  core,  the  helical  coil  steam 
generators,  and  the  balance  of  plant.  In  testing  the 
adaptation  phase  of  the  ANPM  model,  the  results  of 
this  simulator  are  considered  the  first-principle  model 
results  and  are  used  to  seed  the  original  memory 
matrix. 

The  second  simulator  used  was  developed  by  Dr. 
Michael  Doster  at  North  Carolina  State  University 
(NCSU).  NCSU  has  demonstrated  experience  in 
developing  high  fidelity,  full  plant  simulators  for 
predicting  the  dynamic  response  of  pressurized  water 
reactors  during  normal  and  off-normal  operational 
conditions.  An  IRIS  specific  simulator  has  been 
developed  which  includes  a  model  of  the  IRIS 
Pressurizer,  a  six  delayed  neutron  group  kinetics  model, 
a  decay  heat  model  and  a  hot  channel/Departure  from 
Nucleate  Boiling  (DNB)  model.  In  the  IRIS  design, 
the  steam  generators  are  helical  coils,  where  the 
secondary  fluid  flows  on  the  tube  side  of  the  heat 
exchanger.  Detailed  models  have  been  developed  to 
describe  the  dynamics  of  steam  generators  of  this 
design.  For  testing  model  adaptation,  the  results  of 
this  simulator  are  used  as  the  high  fidelity, 
“operational”  data. 

The  two  simulators  share  five  common  sensor 
measurements:  hot  leg  temperature,  cold  leg 
temperature,  feed  flow  rate  and  steam  flow  rate  per 
steam  generator,  and  feedwater  temperature.  The 
adaptation  phase  of  the  ANPM  utilized  these  five, 
highly  correlated  sensors  to  illustrate  adaptation  to 
nominal  operating  data.  Both  simulators  generated  data 
for  a  load-following  power  profile,  which  is  more  likely 
in  a  GAR  than  steady-state  operation.  The  load  profile 
used  is  shown  in  Figure  6.  Figure  7  shows  the  Steam 
Flow  Rate  residuals  for  three  models.  The  worst 
performer  is  the  model  based  solely  on  first-principle 
generated  data.  The  figure  shows  that  this  model  was 
not  able  to  predict  the  behavior  during  periods  of  lower 
power  demand.  The  ANPM  residual  remains  centered 
around  zero,  indicating  that  it  was  able  to  adapt  from 
the  low-fidelity  first  principle  model  to  a  more  accurate 
model.  Finally,  the  “final”  model  is  the  result  of  the 
adaptation  phase,  a  model  which  has  completely 
adapted  to  the  collected  data  and  contains  no  simulated 
exemplars.  This  final  model  has  the  lowest  residuals, 
slightly  better  than  the  adapting  ANPM  model.  The 
mean  squared  errors  of  the  predictions  for  all  models 
and each of the five sensors are shown in Table 1.


Figure 6: Load-Following Power Profile


Figure 7: Model Residuals for Steam Flow Rate

Table 1: Mean Squared Error of Predictions

Sensor Number   1         2         3         4         5
FPM             4.27      0.178     0.899     0.898     0.066
ANPM            0.0011    0.0011    0.001     0.001     0.0002
Final           0.00037   0.00027   0.00066   0.00069   0.00016

These results indicate that the ANPM adapts
correctly from low-fidelity simulated data to high-fidelity
operating condition data. However, the ANPM
must  be  able  to  identify  observations  which  result  from 
faulted  behavior  in  order  to  exclude  these  observations 
from  the  adapting  memory  matrix.  To  test  this  feature, 
a  larger  model  was  built  using  thirteen  highly 
correlated  sensors  from  the  NCSU  simulator.  This 
model  was  used  with  the  proposed  PCA-based  ECM 
methodology  to  identify  known  faulty  data.  The  NCSU 
simulator  was  used  to  generate  data  with  a  heat 
exchanger  fouling  fault.  The  generated  data  includes 
one  day  per  month  of  operation  under  the  nominal  load 
profile  for  twelve  months  with  increasing  fouling  levels 
at  each  month,  ranging  from  1.4%  fouling  to  30% 
fouling.  The  results  of  the  PCA-based  ECM  for  the 
first month are shown in Figure 8. This included only
1.4%  heat  exchanger  fouling  and  was  not  detectable  as 
a  fault.  However,  the  second  month  of  this  fault,  which 
included  3.3%  fouling,  was  identified  as  faulty  (Figure 
9).  For  each  month  following,  the  PCA-based  ECM 
was  able  to  correctly  identify  the  faulted  data. 


[Panel title: T² stats for NCSU IRIS data]

Figure 8: PCA-based ECM for Month 1

Figure 9: PCA-based ECM for Month 2

After  determining  through  the  proposed  ECM 
method  or  traditional  fault  detection  methods  that  a 
fault  is  present,  the  health  monitoring  system  would 
normally  turn  to  a  fault  diagnostic  routine  to  identify 
the  type  of  fault  experienced.  Due  to  time  constraints 
and  the  run  time  of  the  NCSU  simulator,  only  one  fault 
type  is  currently  available.  Integration  of  a  diagnostic 
routine  is  an  area  of  ongoing  work  as  additional  data 
becomes available. The final step after detecting and
diagnosing a fault is system prognostics. The results of
a prognostic model to track heat exchanger fouling are
given next.

3.2  Prognostics 

A  GPM  prognostic  model  was  used  to  estimate  RUL 
for  heat  exchanger  fouling  faults.  This  failure  mode 
presents an interesting case. When the plant is
operating at lower power demand, the heat exchanger
is not stressed as highly, and the effects of the fault are
somewhat muted. However, when only the periods of
high  power  operation  and  the  resulting  residuals  are 
considered,  the  fault  has  a  very  noticeable  trend  in  the 
Steam  Generator  Exit  Temperature,  as  shown  in  Figure 
10.  Again,  due  to  constraints  of  the  simulator,  only  one 
example  of  heat  exchanger  degradation  is  available. 
However,  this  example  was  used  to  generate  additional 
degradation  paths  to  develop  a  GPM  model. 


Figure  10:  Steam  Generator  Exit  Temperature  High 
Power  Residuals 

One hundred generated degradation paths were used
to develop a GPM model, and the simulated data
shown above was then applied to it for testing. The RUL
estimation  results  are  shown  in  Figure  11.  The  blue 
line  shows  the  actual  system  RUL  over  thirteen  months 
of  operation.  It  is  traditional  to  consider  the  RUL  at  any 
time  before  failure  to  be  simply  the  time  between  the 
current  time  and  the  failure  time.  Klinger  (1992)  shows 
that  the  assumption  that  failure  time  is  inversely 
proportional  to  the  degradation  rate  is  valid  for  systems 
which  are  autonomous  in  the  variable  representing 
time.  The  inverse  relationship  should  be  accepted  for 
systems  running  under  an  effectively  constant  load  and 
environment,  which  is  often  seen  in  nuclear  power 
plants.  During  the  first  month,  operation  is  nominal 
and  no  prognostic  estimate  is  made.  As  discussed 
previously,  during  the  first  month  of  faulted  operation, 
the  fault  detection  routine  does  not  detect  the  small 
degradation  of  the  heat  exchanger;  again,  no  prognostic 
model is activated. Finally, at the second month of
faulted operation, the fault is detected and the
prognostic  algorithm  is  activated  to  make  RUL 
estimates.  The  prognostic  model  performs  well,  with 
RUL  estimates  within  5%  of  the  actual  RUL  by  the  7th 
month  of  faulted  operation,  five  months  before  failure. 
As  new  faults  are  detected  and  identified,  a  bank  of 
prognostic  models  can  be  developed  to  account  for 
each  of  the  expected  faults.  During  actual  operation  of 
similar  IRIS  plants,  if  unforeseen  faults  occur,  this 
information  can  also  be  incorporated  into  the  fault 
diagnostic  and  prognostic  modules.  In  this  way,  the 
health  monitoring  system  will  continue  to  adapt  as 
operating  data  becomes  available  and  the  entire  fleet  of 
IRIS  plants  can  leverage  the  experiences  of  each 
individual  reactor.  In  an  actual  prognostic  system,  a 
Type  I,  or  reliability-based,  prognostic  model  can  be 
used  to  estimate  the  system  RUL  before  a  fault  has 
been  detected  and  identified.  This  will  facilitate  full 
life  cycle  prognostics  from  beginning  of  operation, 
through  fault  detection  and  identification,  to  system 
failure. 
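
The RUL definition above (time of failure minus current time) can be illustrated with a simplified path-extrapolation sketch: fit a trend to the degradation measure observed so far, extrapolate it to a failure threshold, and subtract the current time. This is a reduced illustration of the general idea only. It assumes a linear degradation path and a known threshold, and it does not use a population of prior paths as the GPM developed here does. The name `estimate_rul` and the numerical values are hypothetical.

```python
def fit_line(t, y):
    """Ordinary least-squares fit of y ~ a + b*t; returns (a, b)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    b = sxy / sxx
    return y_mean - b * t_mean, b


def estimate_rul(t, degradation, threshold):
    """Extrapolate a fitted linear path to the failure threshold.

    Returns the estimated time remaining after the last observation,
    or None when no increasing trend is present (no finite estimate).
    """
    a, b = fit_line(t, degradation)
    if b <= 0:
        return None
    t_fail = (threshold - a) / b      # time at which the path crosses the threshold
    return max(t_fail - t[-1], 0.0)   # RUL = failure time - current time


# Illustrative, noise-free example: degradation grows 2 units per month,
# failure is declared at 24 units, and 6 months have been observed.
months = [0, 1, 2, 3, 4, 5, 6]
degradation = [2.0 * m for m in months]
rul = estimate_rul(months, degradation, threshold=24.0)  # -> 6.0
```

With a constant degradation rate the extrapolated failure time scales inversely with the fitted slope, which matches the inverse relationship discussed above for systems under effectively constant load.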


Figure 11: Heat Exchanger Fouling RUL Estimation

4.  CONCLUSION

Recent efforts have advanced health monitoring
technologies for legacy nuclear power plants that
capitalize on the large amount of data available
from their operation. However, as new plants come
online  which  differ  significantly  from  current  designs, 
these  monitoring  systems  will  not  be  directly 
applicable.  High-fidelity  first-principle  simulations  are 
available  for  new  plant  designs,  and  the  existing 
empirical  modeling  technology  can  leverage  this  data  to 
provide system monitoring beginning with plant start-up.
With the addition of an automated adaptation
method, these models based on simulated first-principle
data will better learn the behaviors and operating
regions of a specific plant as nominal operational data
becomes available. The proposed ECM method helps
ensure  that  only  nominal  operations  are  learned  and 
faults  are  identified.  This  adaptation  technology  is  not 
only  useful  for  new  plant  designs,  but  could  also  be 
applied  to  restarts  after  refueling  outages  when  sensor 
relationships  routinely  change  slightly  due  to 
recalibration  and  maintenance  activities. 

This  research  presented  key  components  in  a  full 
health  monitoring  system.  Beginning  with  an  adaptive 
monitoring  module,  the  ANPM,  the  ability  to  adapt  to 
nominal  operations  and  to  detect  faults  using  the 
proposed  PCA-based  ECM  method  was  illustrated. 
After  the  detection  of  a  fault,  a  GPM  prognostic  model 
was  applied  to  estimate  system  RUL.  This  information 
could  be  used  to  determine  if  the  plant  can  run  to  the 
next  scheduled  maintenance  cycle  or  if  additional 
maintenance  must  be  planned. 
NOMENCLATURE

AAKR    Auto-Associative Kernel Regression
ANPM    Adaptive Non-Parametric Model
DNB     Departure from Nucleate Boiling
ECM     Expanded Condition Monitoring
FPM     First Principle Model
GAR     Grid Appropriate Reactor
GPM     General Path Model
IRIS    International Reactor Innovative and Secure
NCSU    North Carolina State University
PC      Principal Component
PCA     Principal Component Analysis
RUL     Remaining Useful Life
SPRT    Sequential Probability Ratio Test